[Parquet] route dictionary page through the PageStore #10142

Open
liamzwbao wants to merge 2 commits into apache:main from liamzwbao:issue-10062-move-dict-in-page-store

@liamzwbao liamzwbao commented Jun 15, 2026

Contributor

Which issue does this PR close?

Rationale for this change

This addresses step B of #10062: make the store handle the dictionary page the same way as data pages.

What changes are included in this PR?

  • Replaced the dictionary: Vec<Bytes> field with a tracked dictionary_key: Option<PageKey> (plus dictionary_len). The dictionary page is now put into the PageStore like any other page; only its handle is kept separately so that it can be taken back first at splice time.
  • StreamingColumnChunkReader now drains every page (dictionary first, then data) from the store.
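To illustrate the bookkeeping described in the two points above, here is a minimal, std-only sketch. It is not the code in this PR: `PageStore`, `PageKey`, `dictionary_key` and `dictionary_len` are names taken from the description, but their definitions here, the `ColumnChunkPages` type, all method names, and the use of `Vec<u8>` in place of `Bytes` are assumptions made purely for illustration.

```rust
use std::collections::HashMap;

/// Hypothetical handle to a page owned by the store.
#[derive(Debug, Clone, Copy, PartialEq, Eq, Hash)]
pub struct PageKey(u64);

/// Hypothetical page store: owns the page bytes and hands out keys.
#[derive(Default)]
pub struct PageStore {
    next_key: u64,
    pages: HashMap<PageKey, Vec<u8>>,
}

impl PageStore {
    /// Store a page and return the key needed to take it back.
    pub fn put(&mut self, page: Vec<u8>) -> PageKey {
        let key = PageKey(self.next_key);
        self.next_key += 1;
        self.pages.insert(key, page);
        key
    }

    /// Remove and return a page; `None` if the key is unknown or already taken.
    pub fn take(&mut self, key: PageKey) -> Option<Vec<u8>> {
        self.pages.remove(&key)
    }

    /// Number of pages currently held by the store.
    pub fn page_count(&self) -> usize {
        self.pages.len()
    }
}

/// Hypothetical per-column bookkeeping: keys and lengths only, no page bytes.
#[derive(Default)]
pub struct ColumnChunkPages {
    dictionary_key: Option<PageKey>,
    dictionary_len: usize,
    data_keys: Vec<PageKey>,
}

impl ColumnChunkPages {
    /// Put the dictionary page into the store and remember only its key and length.
    pub fn set_dictionary(&mut self, store: &mut PageStore, page: Vec<u8>) {
        self.dictionary_len = page.len();
        self.dictionary_key = Some(store.put(page));
    }

    /// Put a data page into the store and remember its key in write order.
    pub fn push_data(&mut self, store: &mut PageStore, page: Vec<u8>) {
        self.data_keys.push(store.put(page));
    }

    /// Byte length of the tracked dictionary page (0 if there is none).
    pub fn dictionary_len(&self) -> usize {
        self.dictionary_len
    }

    /// Take every page back from the store: dictionary first, then data pages.
    pub fn drain(&mut self, store: &mut PageStore) -> Vec<Vec<u8>> {
        let keys = self
            .dictionary_key
            .take()
            .into_iter()
            .chain(self.data_keys.drain(..));
        let pages: Vec<Vec<u8>> = keys.filter_map(|key| store.take(key)).collect();
        self.dictionary_len = 0;
        pages
    }
}
```

In this sketch the column state never holds page bytes itself, so anything the store does with its pages would cover the dictionary page as well. Keeping the dictionary key apart from `data_keys` is what lets `drain` return the dictionary page first, regardless of the order in which pages were put into the store.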

Are these changes tested?

Yes. A new test was also added to ensure the dictionary page is in the page store.

Are there any user-facing changes?

No

@github-actions github-actions Bot added the parquet Changes to the parquet crate label Jun 15, 2026
@liamzwbao liamzwbao marked this pull request as ready for review June 16, 2026 04:32
@liamzwbao

Contributor Author

Hi @adriangb @alamb, this should resolve step B mentioned in the issue. The local benchmark seems fine. PTAL, thanks!

@adriangb adriangb left a comment

Contributor

Very nice! I wonder if we could update any of the existing tests to show even lower memory usage?
